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ABSTRACT 

A description of the parameter estimation formulas that are currently 
in use for the orbit determination of space vehicles is presented. The 
weighted least-squares-estimator with the inclusion of a priori param- 
eter information is defined, and its relation to the maximum-likelihood- 
estimator is discussed. Also, the Schmidt-Kalman estimator is described 
and compared with the least-squares-procedure, showing under what 
conditions the two formulas are mathematically equivalent. 


I. INTRODUCTION 


The practical determination of the orbit of a space 
vehicle from radar or optical data involves an overde- 
termined system. On the one hand, the orbit is defined 
by a number of parameters which include six orbital 
parameters, for example, the initial conditions in the 
Newtonian equations of motion, as well as several physi- 
cal constants that enter into the description of the motion. 
On the other hand, the number of radar observations, 
for example, may be on the order of several thousand. 
The problem then is to define an estimation formula that 
will map these several thousand observations onto some- 
thing on the order of a six dimensional space defined by 
the parameters of interest. 


A selection of any estimation procedure can never be 
satisfactory if it is expected that the procedure will give 
the best representation of the actual orbit for all mis- 
sions. It is possible to justify the selection of a particular 
estimation procedure only on purely statistical grounds, 
and then a number of idealizations are always re- 
required before an estimator can be described as 
optimum. Therefore, in this Report a definitive approach 
is taken with regard to estimation formulas, with em- 
phasis on a description and comparison of methods. 
Information on the theory of estimation may be obtained 
from the applicable publications given in the list of 
References. 


II. MAXIMUM LIKELIHOOD ESTIMATION 


There is in the theory of probability a formula known 
as Bayes’s theorem which relates conditional probabilities 
as follows: 

p( x |z) =^p(z|x) (1) 


The notation p(A|B) indicates the conditional probability 
that A will occur given that B occurs, while p(A) indicates 
the unconditional probability that A will occur. In terms 
of orbit determination p(z | x) gives the probability that 
a data vector z occurs given that an arbitrary parameter 
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vector x occurs. The probability of obtaining this param- 
eter vector is given by p(x), and the probability of obtain- 
ing the data vector z is p(z). If any parameter vector x 
is equally likely, then p(x) will be a constant. However, 
it is usual to favor values of x that are related to the 
physics of the problem so that p(x) is a variable. For 
example, let x represent injection conditions of a space 
probe launched from the Earth. The tendency would be 
to assign low probabilities to values of x that resulted 
in a launch below the surface of the Earth, for example, 
or to values beyond a certain height above the surface 
of the Earth. On the other hand, values of x near the 
designed injection conditions might receive fairly high 
probability values. 

In any case, if p(z | x) and p(x) are known then p(x | z) 
can be evaluated as a function of x, and the probability 
that an arbitrary parameter vector occurs given that the 
data vector z occurs is known. Certainly z occurs, so the 
logical choice of a parameter vector x is one x* which 
absolutely maximizes p(x|z). Because x* is the vector 
that has the greatest probability of occurring, it is called 
the maximum likelihood estimate of x. 

A stationary value of p(x|z) can easily be obtained by 
considering the variation of Eq. (1) and by recognizing 
that p(z), the probability of obtaining the particular data 
vector z, is simply an invariant number P. 

rap(*|i) = U(i|*)^ + Kx>4^] 8 * 

(2) 


Then, because 8x is an arbitrary variation, the estimate x* 
of the parameter vector is the value of x that results in 
a zero coefficient of 8x. 


/a i ^ rfp(x*) , / dp(z|x*) _ A 

^~ t ^r L + v ^ — 0 


dx 


(3) 


The derivative dp(x)/dfx is that of a scalar p with respect 
to a vector x and is therefore a vector. As a matter of 
notation when x* is inserted in the place of the argument 
x the meaning is that dp(x)/dx is evaluated at x*, and 
similarly dp(z|x*)/dx means that dp(z|x)/dxis evaluated 
at x*. 


In practice the probability functions p(x) and p(z|x) 
are not known, and in order to obtain an estimate x*, an 
assumption of the probability distribution is required. 
For a normal or Gaussian distribution on p(x) it is possi- 
ble to write 



Vet (TV) 

2tt 




(4) 


This is the normal distribution function for L variables 
x u x 2 , * * * , x L taken as elements of the column vector^ 
and having as a covariance matrix the L X L matrix T x . 
The notation Det(A) expresses the determinant of A. 

In a normal function such as p(x) the particular value 
of x designated by x; is called the mean value of x, which 
is the value of x that has the highest probability of occur- 
ring. For the case where x represents orbit parameters, x 
represents the most likely description of the orbit without 
the introduction of the observation vector z, and V T 
represents the confidence that exists in this independent 
determination of x. For example, ‘x might be the designed 
or preflight value of x before observational data are ob- 
tained, and, in fact, for this reason it is often called the a 
priori estimate of the orbit parameters. Similarly V x might 
be some assigned covariance matrix on this estimate, and 
so correspondingly it is called the a priori covariance 
matrix on the parameters. The matrix T x might be con- 
structed by considering the errors on the guidance sys- 
tem that establish the designed trajectory, or, in other 
words, by converting error sources in the guidance system 
to errors in the injection position and velocity. Naturally 
such an analysis would be expected to result in correla- 
tions in the six position and velocity elements, and so T x 
would in general be nondiagonal. 

An objection to using an a priori covariance matrix 
based on a guidance error analysis is easily raised by 
considering the meaning of an error or standard deviation 
on a guidance component. To obtain an empirical esti- 
mate of the error on such a component it is repeatedly 
subjected to flight conditions and by analyzing the many 
samples obtainedjts probability function is determined. 
Thus the matrix F x is representative of an estimation of 
the variances and covariances arising from an averaging 
of all possible flights designed to yield a certain set of 
injection conditions. On the other hand the estimate x* 
of the parameters is obtained from only one flight, namely 
the one associated with the observation vector z, and 
thus to avoid an inconsistancy the a priori estimate x" and 
its error matrix F x should be influenced by the particular 
flight in question. Therefore, any in-flight monitoring of 
the guidance system should be heavily weighted in deter- 
mining x* and T x . For example, if a guidance component 
had clearly failed the predicted a priori parameter, vector 
x could be considerably different than the designed value. 
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Actually, the use of the a priori estimate and statistics 
is fairly limited in the determination of the injection con- 
ditions of a space probe. Usually a nominal injection with 
a very large covariance matrix T x is assumed. Thus, in 
the early orbit determination, when relatively few data 
are available, the solution to the estimate is held within 
bounds. 


The greatest application of the a priori information is 
made in later estimations when a fair orbit has been 
determined from independent earlier data. For example, 
if the data vector z represented observations made from 
a spacecraft in the vicinity of a distant planet and 
the parameter vector was the orbital parameters of the 
spacecraft with respect to that planet, then the a priori 
estimate of these parameters would be the determination 
made from earlier Earth-based observations, and the 
covariance matrix T r would give the error description of 
that determination. This will become more apparent 
later on, however, when the estimation formula for x* 
is fully developed and applied to the orbit determination 
problem. 

For the moment it is necessary to consider an expres- 
sion for p( zjx) based on a hypothetical normal distribu- 
tion. Because it is assumed that the data vector z has a 
mathematical rep esentation z = z(x), a reasonable dis- 
tribution function is 


p(*! x ) = ex p ~ z W ] 

IY[z-z(x)]| (5) 


This function p(z|x) says that if one selects a particular 
x then the probability of z occurring can be computed, 
and, in fact, the implication is that of all possible values 
of x, there is one value that maximizes this probability. 
The rate at which the probability of obtaining z de- 
creases as x departs from this maximizing value is gov- 
erned by the N X N data covariance matrix T z . If the 
function p(x) were not involved in the development 
then the value of x required for a maximum probability 
p(z|x) would be the appropriate estimate of x. Instead, 
Eq. (3) is applied to p(x), and p(z | x) and the estimate x* 
differs from the value that maximizes p(z|x) alone. In 
other words, the estimate of x is influenced by its a priori 
value. The derivatives of the probability functions are 


= V (z|x) AT; 1 [z - Z(x)] (7) 

where A is an N X L matrix with elements A,,/ = dz n /dxi. 

A substitution of Eq. (6) and (7) in Eq. (3) results in 
the formula for the estimate x* of the parameters 

A T ( x *) rf [z — z(x*)] + T' 1 (x — x*) = 0 (8) 

As pointed out at the beginning of the development of 
the above estimation formula, no rigorous justification of 
its validity in the sense of being optimum or best is in- 
tended. The formula could have been written down at 
the outset as a definitive relation for the estimate x*, and 
what follows would have proceeded without any discus- 
sion of the meaning of Eq. (8). Hopefully, however, this 
presentation of the formula made it a little easier to 
accept. For a more mathematical approach there are 
several informative works on the theory of estimation 
which describe in detail the properties of estimation 
formulas, such as Eq. (8), and which discuss the condi- 
tions for “best” estimates (e.g., Ref. 2 and 4). 

In any case, the size of the data vector z must be 
considered in the establishment of a computer program 
to solve for x*. In particular, if the size of z is large 
it is clear that the matrix T z , which represents the as- 
sumed covariance matrix on the data, should be given in 
an easily invertible form. Customarily r~ is assumed diag- 
onal so that the data are taken to be uncorrelated and 
the matrix T-" 1 is used directly in the form of a weighting 
matrix W. The elements of W may be simply the inverse 
of the squared error or standard deviation on each data 
sample, or if correlation is known to be present, the corre- 
lated points can be assigned smaller weights than the 
strictly independent data. The important point is that in 
constructing an orbit determination program it is agreed 
beforehand that the data will be weighted with a dia- 
gonal matrix W, and the numerical problem of inverting 
large matrices disappears. A new analytical problem is 
created, however, in that a weighting matrix must be 
chosen that is optimum in some sense and that reflects 
the peculiarities of the particular data obtained. The 
problem of finding an optimum weighting matrix is not 
considered here. 

The other problem of including known data error de- 
scriptions in the matrix W is one that can be solved only 
by analyzing the relevant observational equipment in 
conjunction with experimental data obtained from its use. 


dpi?) 

rfx 


= Pi^Wx 1 ( x - x ) 


(6) 


In addition to the data weighting matrix W = r ~ _1 
there is a weighting matrix IV 1 for a priori parameters. 
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The size of this matrix is such that it can be inverted if 
its determinant is significantly different from zero. In 
other words, orbit determination problems are concerned 
with the estimation of typically six orbital parameters, 
although a number of physical constants can be added 
as well. However, it is always feasible to limit the num- 
ber of parameters to some reasonable number. Therefore, 
F x is assumed to be the actual nondiagonal covariance 


matrix on the a priori parameters x and the estimation 
formula for x* is written 

A T (x*) W [z — z(x*)1 + T" 1 (x — x*) = 0 

(9) 

Formula (9) defines the weighted least-squares estima- 
tion procedure with the inclusion of a priori parameter 
information. 


111. SOLUTION OF THE LEAST-SQUARES-ESTIMATION 

FORMULA 


A system of L nonlinear algebraic equations in L un- 
knowns (* 1 * x 2 * • • • , *„*), is represented by Eq. (9) and 
the solution of this system for the parameter estimate 
x* naturally follows an iterative scheme. The procedure 
followed here applies the Newton-Raphson iteration for- 
mula for L variables. If a solution to a system of equa- 
tions F(x) = 0 is sought where F and x are both L 
vectors then the iteration formula is 

x (n+1) = x<”> - (f) 1 F (x<*>) (10) 

where x (n) is the nth approximation to the solution 
x and x (n+1) is the improved (n + l)st approximation. 
The matrix 0F/0x is normally evaluated at x (n) although 
the procedure can be modified by evaluating 0F/0X only 
once at the initial approximation to x. (It should be noted 
that the derivative of a vector with respect to a vector is 
introduced as a notational convenience only.) 

A discussion of the convergence properties of Eq. (10) 
is fairly involved if the dimension of the system is greater 
than unity, although for the special case where x is a 
scalar it can be shown quite easily that the iteration 
formula is second order. In other words, the error in the 
(n + l)st approximation to the solution is proportional, 
to the square of the error in the nth approximation, and 
if the initial guess to x is good enough, convergence is 
assured. In practice, this sort of statement applies to the 
L dimensional system of equations as well, and for this 
reason the Newton-Raphson formula is used in the orbit 
determination problem. The convergence properties of 


Eq. (10) are not discussed here, although any standard 
numerical analysis text can be consulted for information 
on convergence. 


Now the function F(x*) is simply the left-hand side 
of Eq. (9), and all that is required to apply formula (10) 
is the matrix 0F/0x or, equivalently, the derivative of 
Eq. (9) with respect to the variable x. In other words, 
the function F(x) is 

F (x) = A T (x)W |"z z(x)l + T" 1 (x — x) 

( 11 ) 

and 

|| = - A r (x)WA(x) - rV (12) 

The variation of A(x) with x has been neglected in 
Eq. (12). Its inclusion in 0F/0x results in considerable 
complication because of the way it enters and because 
it requires the second partial derivatives of the data vec- 
tor z with respect to the parameters x. Experience has 
shown that the iteration formula, which results from a 
neglect of the variation in A(x), is satisfactory in provid- 
ing rapid convergence to the solution x* of Eq. (9). 
Therefore, the contribution of the second partial deriva- 
tives is only noted here before proceeding to the usual 
estimation formula. Designate the second derivatives of 
the observation vector z(x) with respect to the r and s 
components of x by the vector d r8 . 


d 


rs 


3 2 z 

dx r dx s 


(13) 
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Then to include d rs in the iteration formula each r,s ele- 
ment (0F/0x) rs of the matrix 0F/0x is modified so that 

(f) s = - [A r (x)WA(x) + r- 1 ] „ + d'W[ z - z(x)] 

(14) 


The final iteration formula thus assumes that the second 
term in Eq. (14) is small and there results the standard 
relation 

x (» + i, _ x (»> = (x (n) )WA(x (n) ) + r- 1 ]- 1 |A T (x<’*>)W 

[z - z(x<”>)] + r- 1 (x - X<»>)1 (15) 


IV. SCHMIDT-KALMAN ESTIMATION 


An interesting variation of formula (15) has been used 
by Schmidt, Smith and others in the determination of 
orbits by a continual updating of the estimate of the 
orbit. In other words, at some observation time t the 
a priori estimate x of the orbit, as obtained from pre- 
vious data, is improved by adding the observations taken 
at time t. An immediate application of this method (also 
known as the method of sequential estimation) is in the 
navigation of a manned space vehicle where the orbit is 
periodically re-established by taking one or more obser- 
vations from the vehicle and by processing the data 
through a small on board computer. 

The form of the estimator was established by Kalman 
with the aid of linear statistical filter theory, and, there- 
fore, to derive his formula from the least-squares formu- 
lation it is necessary to linearize formula (15) by setting 
the estimate x* of x equal to x (n+1) and the a priori 
estimate x of x equal to x (n) . This is justified if the 
a priori estimate of the orbit is sufficiently close to the 
new estimate x*, or, equivalently, if the new data are 
sufficiently consistent with the past data so that only one 
iteration is required in formula (15). If this is not the 
case, if, for example, a relatively uncertain midcourse 
maneuver were applied just before the new observation 
time, then the estimate of the orbit would deviate from 
the least-squares estimate that satisfies Eq. (9). Further, 
because x (n) is assumed equal to x the second term in 
the brackets of Eq. (15) is always zero, and a repeated 
application of the Schmidt-Kalman estimator would not 
result in convergence to the solution of Eq. (9). How- 
ever, with these reservations in mind the linear form of 
formula (15) is written 

8x = x* - x = (A T WA + T; 1 )- 1 A T WS z (16) 


where, of course, the terms on the right-hand side of the 
formula are evaluated at x = x and the residual vector 
Sz is 

8z = z — z(x) (17) 

Note that for one observation and six parameters both 
Sz and W are scalars, T x is a six by six matrix and A is 
a six-dimensional row vector. The Schmidt-Kalman form 
of the correction 8x is obtained by writing the matrix 
(A T WA + IV 1 ) -1 in a different form. 

Consider the matrix 

M = (I? w) < 18 > 

If the dimension of T x is L X L while that of W is N X N, 
then the matrix A is of dimension N X L while that of 
M is (L + N) X (L H- N). If an inverse to M exists then 
it can be written 

"-■=(£' £) < 19 > 

where the dimensions of X u X 2 , and Y are respectively 
L X L, N X N and N X L. 

Now 

MM- 1 = I M (20) 

where I n indicates the n X n unit matrix. Forming the 
matrix products involved in MM 1 yields 

T; 1 X, + ATT = I L (21) 

AY T - W~ X X 2 = I N (22) 
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r- 1 Y T + A t X 2 = 0 (23) 

AX, ~ W 1 Y = 0 (24) 

From Eq. (24), if the inverse of W 1 exists, then 

Y = WAX, (25) 

and if the inverse of IV 1 also exists, then Eq. (21) and (25) 
yield 

(A T WA + r- 1 )X 1 = I L (26) 

or for ( A T WA + IV 1 ) nonsingular 

X, = ( A T WA + r - 1 )-' (27) 

a second expression for X, can be obtained from Eq. (21), 
(22) and (23). 


tion that there are no systematic errors in the residual 
vector Sz. However, for the moment it will be assumed 
that Eq. (36) is valid and write 

Sx = r*A T W8z (37) 

The expression rVA T W can be simplified. Set 

A = T X A T W (38) 


Then 

a = rrv — r x A T (w- 1 + Ar^A 7 )- 1 Ar,] a t w 

(39) 


or 

a = r x A T rw - (w- 1 + Ar.A 7 ’)- 1 at t a t w] 

(40) 

Further 


From Eq. (23) 

Y T — — r*A T X 2 (28) 

and X 2 can be found from Eq. (22). 

X 2 = - (W- 1 + Ar< r A T )- 1 (29) 

Substitute X 2 in Eq. (28) to obtain 

Y = (W- 1 + Ar^A 7 )" 1 AT, (30) 

where T x , W and AT X A T are all symmetric matrices. 


a = r^A r (w- 1 + Ar I A r )- i [(w- 1 +Ar,A r )w 

- Ar x A T W] 


(41) 

and 


A = I\A T (W- 1 + AI^A 7 )- 1 

(42) 

so that 


* 

< 

II 

<1 

(43) 

and the estimation formula becomes 


8x = A*8z 

(44) 


Now solve Eq. (21) for X, 

X, = - T X A T (W- 1 + Ar^A 7 )" 1 AT X (31) 

and simply equate the two expressions for X,. 

(a t wa + r - 1 Y 1 = r x - v x a t ( w - 1 + at^y 1 af x 

(32) 

or define 

A* = T X A T (W- 1 + AV X A T Y' (33) 

so that 

(A T WA + r - 1 Y 1 = r * - A * Ar , (34) 

then the estimator Sx from formula (16) can be written 

8x = (r* - A *AK) A T W8z (35) 

Schmidt and Kalman call (T x — A*AT ar ) the covariance 
matrix F x on the parameters after the data z are added. 

r * = (a t wa + r - 1 Y 1 = r *~ A * Ar - ( 36 ) 


The procedure for continually updating an orbit can 
be expressed as a closed loop process which flows accord- 
ing to the following steps. 

Step 1 . Given an estimate of the orbit x* at time t inte- 
grate the equations of motion of the vehicle to a time 
t + A t, the next observation time. This yields an a priori 
estimate x of the orbital parameters at time t + A t, as 
well as the residual vector 8z = z — z (x). 


Step 2. Map the covariance matrix F x at time t to the 
new time t + A* to yield the a priori covariance matrix 
T x at time t + At. Again if systematic errors are neglected 

T, = UF X U T 


where U is a matrix that relates differential increments 
in the orbital parameters referenced to time t + At to 
parameters referenced to t. The r, s element of U is 


_ dX r {t + At) 

~ dx 8 (t) 


r,s = 1,2, * * • ,L 


It will be shown in a later part of this sequence of reports For example, x(t + At) represents the Cartesian compo- 

that T* is the covariance matrix only under the assump- nents of position and velocity at time t + A t, while x(t) 


6 


JPL TECHNICAL REPORT NO. 32-498 


represents the position and velocity components at time t. 
In this case U is a six by six matrix. 


Step 4. Compute the weighting function 


a* = r x A T (w- 1 + aPsA 7 )- 1 


Step 3. Compute the matrix A. Because the parameters 
x are referenced to the observation time t + At, the A 
matrix simply contains the partial derivatives of the ob- 
servation vector with respect to the parameters at time 
t + At. 


Step 5. Compute an improved estimate of the orbit at 
time t + At 

x* = x + A*8z 



Step 6. Compute the new covariance matrix for the 
estimated parameters x*. 


The number N is greater than or equal to unity but less 
than some number which represents the upper limit for 
a feasible numerical N xN matrix inversion in step 4. 


r, = r* - a*aP* 


Step 7. Set t -\- At equal to t and go back to step 1. 
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